4 research outputs found

    From Input to Failure: Explaining Program Behavior via Cause-Effect Chains

    Debugging a fault in a program is an error-prone and resource-intensive process. My doctoral research aims to support developers during this process by integrating test generation as a feedback loop into a novel fault diagnosis, narrowing down causality by validating or disproving suggested hypotheses. I will combine input, output, and program state to detect the relations relevant for an immersive fault diagnosis. Further, I want to introduce an approach for targeted test generation that leverages statistical fault localization to extract oracles from execution features and thus identify failing tests.
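
    As a minimal, hypothetical sketch of such a feedback loop (none of these names come from the proposed tool): generate test inputs, predict each outcome from the current failure hypothesis, and treat any mismatch between prediction and observation as a counterexample that disproves the hypothesis:

        import random

        def program(x):              # hypothetical subject under test
            return 1 // x            # fails (ZeroDivisionError) iff x == 0

        def hypothesis(x):           # current failure hypothesis: "x == 0"
            return x == 0

        def validate(trials=1000):
            # Test generation as a feedback loop: a single mismatch
            # disproves the hypothesis; surviving all trials supports it.
            for _ in range(trials):
                x = random.randint(-10, 10)
                try:
                    program(x)
                    failed = False
                except ZeroDivisionError:
                    failed = True
                if failed != hypothesis(x):
                    return x         # counterexample: refine the hypothesis
            return None              # no counterexample found

        print(validate())            # None: the hypothesis survived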

    SFLKit: A Workbench for Statistical Fault Localization

    Statistical fault localization aims at detecting execution features that correlate with failures, such as whether individual lines are part of the execution. We introduce SFLKit, an out-of-the-box workbench for statistical fault localization. The framework provides straightforward access to the fundamental concepts of statistical fault localization. It supports five predicate types, four coverage-inspired spectra, like lines, and 38 similarity coefficients, e.g., TARANTULA or OCHIAI, for statistical program analysis. SFLKit separates the execution of tests from the analysis of the results and is therefore independent of the testing framework used. It leverages program instrumentation to log events and derives the predicates and spectra from these logs. This instrumentation allows for supporting multiple programming languages and for extending statistical fault localization with new concepts. Currently, SFLKit supports the instrumentation of Python programs. SFLKit is highly configurable, requiring only the logging of the required events.
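
    As a hedged illustration of two of these similarity coefficients (plain textbook formulas, not SFLKit's API): the suspiciousness of a line follows from its spectrum, i.e., how many failing and passing runs executed it:

        import math

        def ochiai(failed_cov, passed_cov, total_failed):
            # failed_cov: failing runs that executed the line
            # passed_cov: passing runs that executed the line
            denom = math.sqrt(total_failed * (failed_cov + passed_cov))
            return failed_cov / denom if denom else 0.0

        def tarantula(failed_cov, passed_cov, total_failed, total_passed):
            fail_ratio = failed_cov / total_failed if total_failed else 0.0
            pass_ratio = passed_cov / total_passed if total_passed else 0.0
            total = fail_ratio + pass_ratio
            return fail_ratio / total if total else 0.0

        # A line executed by 4 of 5 failing runs but only 1 of 20 passing
        # runs is highly suspicious under both coefficients.
        print(ochiai(4, 1, 5))           # 0.8
        print(tarantula(4, 1, 5, 20))    # ~0.94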

    Tests4Py: A Benchmark for System Testing

    Benchmarks are among the main drivers of progress in software engineering research, especially in software testing and debugging. However, current benchmarks in this field are not well suited for specific research tasks, as they rely on weak system oracles like crash detection, come with only a few unit tests, demand elaborate research setups, or cannot verify the outcome of system tests. Our Tests4Py benchmark addresses these issues. It is derived from the popular BugsInPy benchmark and includes 30 bugs from 5 real-world Python applications. Each subject in Tests4Py comes with an oracle to verify the functional correctness of system inputs. Besides, it enables the generation of system tests and unit tests, allowing for qualitative studies that investigate essential aspects of test sets as well as extensive evaluations. These opportunities make Tests4Py a next-generation benchmark for research in test generation, debugging, and automatic program repair.
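
    To illustrate the gap between a weak crash oracle and a functional system oracle (a toy sketch under assumed names, not Tests4Py's actual interface):

        import subprocess

        def crash_oracle(cmd):
            # Weak oracle: only detects abnormal termination.
            return subprocess.run(cmd).returncode == 0

        def functional_oracle(cmd, expected_stdout):
            # Functional oracle: verifies the outcome of a system test.
            result = subprocess.run(cmd, capture_output=True, text=True)
            return result.returncode == 0 and result.stdout == expected_stdout

        # A bug that prints a wrong result without crashing passes the
        # crash oracle but is caught by the functional oracle (on Unix):
        print(crash_oracle(["echo", "helo"]))                    # True
        print(functional_oracle(["echo", "helo"], "hello\n"))    # False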

    Semantic Debugging

    Why does my program fail? We present a novel and general technique to automatically determine failure causes and conditions, using logical properties over input elements: "The program fails if and only if int(⟨length⟩) > len(⟨payload⟩) holds; that is, the given ⟨length⟩ is larger than the ⟨payload⟩ length." To obtain such diagnoses, our AVICENNA prototype uses modern techniques to infer properties of passing and failing inputs and to validate and refine hypotheses by having a constraint solver generate supporting test cases. As a result, AVICENNA produces crisp and expressive diagnoses even for complex failure conditions, considerably improving over the state of the art with diagnoses close to those of human experts.
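
    Such a diagnosis is an executable predicate over input elements; a minimal sketch (hypothetical samples, not AVICENNA's implementation) checks whether the predicate predicts every observed outcome:

        def diagnosis(length, payload):
            # "The program fails iff int(<length>) > len(<payload>)."
            return int(length) > len(payload)

        # (input, observed failure) pairs, e.g., from a heartbeat-style
        # protocol where <length> declares the size of <payload>.
        samples = [
            (("4", "abcd"), False),   # length matches payload: passes
            (("8", "abcd"), True),    # length exceeds payload: fails
            (("2", "abcd"), False),
        ]

        # The diagnosis holds iff it agrees with every observation.
        print(all(diagnosis(*inp) == failed for inp, failed in samples))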
